ABSTRACT: As enrolment in online courses has grown and LMS data has become accessible for 
analysis, researchers have begun to examine the link between in-course behaviours and course 
outcomes. This paper explores the use of readily available LMS data generated by approximately 
700 students enrolled in the 12 online courses offered by Pamoja Education, the course provider 
for the International Baccalaureate, in 2012-2013. The findings suggest that LMS data sets can 
indeed provide useful information on the relationship between online behaviours and final 
grades; that higher levels of online behaviours are associated with higher performance; that two 
types of behaviour, one associated with attendance and the other associated with interactivity, 
operate separately; and that these two types of behaviour function differently depending on 
gender. 

1 INTRODUCTION 

The last decade has seen exponential growth in enrolments in online courses at the high school level. In 
2003, there were approximately 330,000 enrolments among students in public schools in the United 
States (NCES, 2005). By 2010, there were approximately 1.8 million, 74% at the high school level (NCES, 
2011). There were another 300,000 full-time online students in charter schools (Watson, Pape, Murin, 
Gemin, & Vashaw, 2014). The vast majority of these students are taking one or two courses, generally 
because a course is not offered at their school but also to recover credits for failed or missed courses, to 
free up their schedules, or to gain experience with an online course before college (iNACOL, 2013). In a 
growing number of U.S. states, at least one online course is required for high school graduation (Watson 
et al., 2014). These online courses may be provided by state or district virtual schools, by virtual charter 
schools, or by private providers. At the high school level, online courses are generally asynchronous but 
can be either self-paced or follow a cohort-based weekly schedule. The former has been the most 
common model for high school courses, while the latter is more common in higher education. 


Online courses generate streams of data from the Learning Management System (LMS) that can be used 
to provide insights into student behaviour in the online environment, especially as it relates to student 
success. However, although online learning at the K-12 level has grown rapidly, research using LMS data 
at this level remains sparse (Lowes, 2014), especially compared to the amount of work in higher 
education. In addition, as we will see below, the existing research has looked at different variables and 
has yielded mixed and sometimes contradictory results (Davies & Graff, 2005; Dawson, McWilliam & 
Tan, 2008; Hung, Hsu, & Rice, 2012; Hung & Zhang, 2008; Liu & Cavanaugh, 2011a, 2011b, 2012; 
Macfadyen & Dawson, 2010; Ramos & Yudko, 2008; Ryabov, 2012; Wang & Newlin, 2000; Wei, Peng, & 
Chou, 2015). This paper uses LMS data generated by approximately 700 high school students who were 
enrolled in 12 asynchronous cohort-paced online courses offered by Pamoja Education (PJE), the course 
provider for the International Baccalaureate (IB), in order to explore the link between LMS behaviours 
and course outcomes at the high school level. 

There are at least three types of interaction in online courses: learner-content, learner-teacher, and 
learner-learner (Moore, 1989). It is the emphasis on learner-learner interaction that distinguishes the 
cohort-based model of online learning adopted by PJE from self-paced models, which are more akin to 
independent study or tutoring. The inclusion of student-student interaction is central to the design of 
online courses based on constructivist learning theories (Anderson, 2003) because constructivists 
believe that learners need to co-construct knowledge — and therefore to interact with each other — in 
order to learn and retain what they have learned (Jonassen, 1999). In higher education, student-student 
interaction in online courses has long been considered essential for both learning and for motivation 
(Bernard et al., 2009). In the K-12 online environment, the need for student-student interaction has 
been integrated into course quality standards (e.g., Southern Regional Education Board, 2006; iNACOL, 
2011). IB courses are explicitly constructivist in design (International Baccalaureate, 2013) and PJE's 
course designers build in multiple opportunities for student-student interaction, primarily through 
discussion forums and group projects. We were therefore particularly interested in the contribution of 
LMS behaviours relating to student-student interaction to course outcomes. 

2 PRIOR RESEARCH ON ONLINE BEHAVIOURS 

The research using LMS data has found that higher levels of activity are almost always associated with 
better outcomes (as measured by final grades) and greater student satisfaction (for a review, see Cho & 
Kim, 2013). In looking at this literature from a constructivist perspective, it is useful to adapt a 
distinction that Chapman (2003) made for face-to-face learning, between activity-as-participation — for 
instance, attending class and submitting assignments — and activity-as-interaction — the sustained 
involvement in learning activities involving cognitive, behavioural, and affective aspects. In face-to-face 
classrooms, activity-as-participation is measured in a number of ways, including attendance, number of 
homework or other assignments submitted, and time on task. In online courses, the most easily 
accessible counterparts to these measures are a combination of frequency and duration variables 
(Morris, Finnegan, & Wu, 2005): number of logins, number of pages accessed, number of assignments 
submitted, time spent in the system, etc. In what follows, we will call these attendance variables. For 
the online counterparts to classroom interaction, the most accessible and frequently used LMS variables 
are discussion forum posts viewed and discussion forum posts authored. In what follows, we will call 
these interactivity behaviours. Taken together, these become a measure of overall student engagement. 

As researchers have searched for behaviours associated with success, the kinds of activity they have 
analyzed have differed, as have their results. A number of studies have found that only attendance 
behaviours are correlated with final grades. For example, in higher education Wang and Newlin (2000) 
looked at homepage hits, posts read, and posts written for 51 students in three sections of a Psychology 
course. They found that the one frequency behaviour — homepage hits — alone predicted student 
grades. Similarly, Ramos and Yudko (2008), examining the same three variables, found that page hits 
was the only variable that had a positive relationship with final grades. On the other hand, Ryabov 
(2012), looking at 286 students in online introductory sociology courses at one university, found that 
only the duration behaviour of time spent was significant. 

Other studies have found that interactivity behaviours were also important. For example, Hung and 
Zhang (2008), looking at 98 students in an undergraduate business course, found that participation in 
online discussions had a stronger correlation with performance than accessing course materials. Wei, 
Peng, and Chou (2015), looking at 381 undergraduates in a general education course, found that 
number of discussion board postings and frequency of viewing reading materials, along with frequency 
of logins, were positively correlated with final exam scores. Macfadyen and Dawson (2010), using data 
from 118 students in five different biology courses, found that it was number of messages posted, 
number of email messages sent, and number of assessments completed that had positive correlations 
with final grades. Morris, Finnegan, and Wu (2005), looking at over 423 students in three undergraduate 
education courses, found that it was discussion posts viewed, as well as content pages viewed and time 
spent viewing discussions, that had positive correlations with final grades. Similarly, Dawson, McWilliam, 
and Tan (2008), looking at a 1,000-student undergraduate science class, found that more time spent 
online and more participation in discussions were associated with higher final grades. In contrast, Davies 
and Graff (2005), looking at 122 students over the course of a year, found no relationship between 
discussion forum activity and final grades. 

To date, there have been very few studies using LMS data at the high school level. While the research in 
higher education has often focused on one or two classes or a single subject, the high school studies 
have looked at large numbers of students across many subjects; this adds subject area as a complicating 
factor. Thus Liu and Cavanaugh (2011b, 2012), using data for the 2007-2008 academic year at one state- 
run online high school, looked at eight possible predictors of end-of-course exam results for two biology 
and four algebra courses, with a total of 662 students. The predictors included one frequency variable 
(number of logins) and one duration measure (total minutes spent in the system) but no measures of 
student-student interaction (presumably because these were self-paced courses). They found a mixed 
picture: number of logins was not correlated with final grades for the two biology courses but was 
correlated for three of the four algebra courses, while time spent logged in was correlated with final 
grades for both biology courses and three of the four algebra courses. Further, when they examined a 
wider range of 15 courses, with a total of 1,794 students, they found that number of logins was 
correlated with final grades in only three of the 12 courses but time spent logged in was significant in 11 
(Liu & Cavanaugh, 2011a). 

While Liu and Cavanaugh looked at three behaviours, Hung, Hsu, and Rice (2012) looked at seven, using 
data from approximately 7,000 students at a statewide online school for the 2009-2010 academic year. 
Most of the behaviours were frequency behaviours (clicks, course content accessed, course access, page 
access, tab access, module access), with only one related to student-student interaction (number of 
discussion board posts) and none related to duration. They found a positive relationship between all 
these behaviours and performance, although there were variations from course to course, depending on 
subject area and on course level. 

All of these findings suggest that there is indeed a link between student activity and outcomes as 
measured by final grades. However, the behaviours chosen for analysis have varied, in part depending 
on what is available from the LMS and in part on which behaviours each researcher feels are important. 
In addition, most have used first-generation statistical techniques for their data analysis (i.e., Pearson's 
correlations, multiple regression, hierarchical linear modelling, decision tree and cluster analysis), which 
only allow the investigation of a single layer of the relationship(s) between the explanatory variable(s) 
(i.e., the behaviours) and the outcome variable (course grade). We hoped that we could overcome the 
limitations in previous studies by building models that hypothesize multiple layers of relationships and 
test these interrelationships at one time using structural equation modelling (SEM), a family of methods 
that are second-generation statistical techniques. For instance, since all of the literature suggests a link 
between online behaviours and learning outcomes, we could explore whether such behaviours can be 
explained by a latent construct that cannot otherwise be directly measured. In addition, one important 
advantage of SEM is its ability to explicitly estimate the unreliability of the observed variables (i.e., 
measurement error), whereas standard linear regression modelling assumes that variables are observed 
without error (Bollen & Long, 1993; Gerow, Grover, Roberts, & Thatcher, 2010; Kline, 2011). 

Although only a few researchers have considered whether there are gender differences in LMS 
behaviours, their findings suggest that male and female students may approach online learning 
differently, with different degrees and styles of participation (Yukselturk & Bulut, 2009; Rovai, 2001). For 
example, McSporran and Young (2001), analyzing data from a college-level web design course, found 
that women showed consistently higher levels of activity than males in their online classes, completed 
more assignments, seemed to be better at self-regulation, and performed better. Similarly, Hung, Hsu, 
and Rice (2012) also found that females performed better and were more active than males. Johnson 
(2011), analyzing data from a large information systems course, found that females' higher levels of 
interaction and general sociability were an advantage in online courses and likely to lead to better 
outcomes for females than males. We therefore felt that it was important to explore the role of gender 
in our analysis. 

3 RESEARCH QUESTIONS 

While some of the researchers cited above found that only the attendance behaviours were significantly 
correlated with final grades, others found that interactivity behaviours were important as well. For 
those who believe that student-student interaction is critical for learning, both types of activity should 
show such a relationship. Desire2Learn, PJE's LMS, provided us with a limited number of behaviours. 
These included three attendance behaviours — number of days accessed, number of logins, and time 
spent logged in — and two interactivity behaviours — posts viewed and posts authored. Using this data, 
we addressed the following questions: 

RQ1: Is there a relationship between the students' course behaviours and their course performance? 
RQ2: Is there a relationship between the students' course behaviours, when grouped into attendance 
and interactivity behaviours, and their course performance? 

RQ3: Should gender be taken into consideration in the analysis of course behaviours and course 
performance? 

4 THE RESEARCH SITE 

Our study setting was the 12 asynchronous online courses offered by Pamoja Education (PJE), the course 
provider for the International Baccalaureate (IB), in 2013-2014. The courses were online, fully 
asynchronous, and followed a cohort-paced weekly schedule, similar to most online courses in higher 
education. Some courses had only one section, while others had as many as six, for a total of 39 
sections, with approximately 20 students per section and a total of 798 students. Most of the students 
were taking these courses as part of the IB Diploma, a challenging program for students in their last two 
years of high school, but some were taking them as single courses. All of the courses lasted two years. 
They included Business Management, Economics, Film, Information Technology in a Global Society, 
Mathematics, Philosophy, Psychology, Mandarin, and Spanish. The students not only completed 
readings, wrote essays, and submitted other assignments but were expected to interact with each other 
in structured, facilitated discussion forums and to engage in multi-week group projects. Discussion 
forum posts were not graded but discussion forum participation was part of the course evaluation rubric 
and in that way became part of the final grade. 

5 COURSE BEHAVIOURS AND COURSE PERFORMANCE: EXPLORATORY 
ANALYSIS 

5.1 Course Performance 

The sample began with the entire cohort of 798 students in the first year of their courses. The gender 
makeup was 55% female (n = 439) and 45% male (n = 359). In the IB system, grades are numeric, not 
alphabetical, and range from 1 to 7, with 4 to 7 considered passing grades. Of the 798 students enrolled 
as of the second week of the academic year (chosen as the beginning point because students were still 
enrolling in week 1), 689 received a year-end grade, with most of those who dropped doing so within 
the grace period (i.e., without penalty). As Table 1 shows, 21% of the students were not passing at the 
end of the fall semester and 23% at year-end: 

Table 1. Percentage receiving each grade

               End of fall semester    End of academic year
Final grade    Count    Percent        Count    Percent
7              143      18%            96       14%
6              194      24%            143      21%
5              175      22%            151      22%
4              115      15%            143      21%
1-3            166      21%            156      23%
Total          793      100%           689      100%


In addition, as is common at the high school level (Voyer & Voyer, 2014), females on average had higher 
grades than males at both points in time, and a higher percentage of females than males received the 
highest grade of 7 (Table 2). 


Table 2. Percentage of passing students by grade category by gender

          End of fall semester                       End of academic year
          Mean grade   % passed (4-7)   % with 7     Mean grade   % passed (4-7)   % with 7
Female    5.1          85%              23%          4.9          81%              20%
Male      4.5          72%              12%          4.4          74%              7%


5.2 Descriptive Statistics of Course Behaviours 

The LMS output came to us as data for each student cumulated for each week that he or she was 
enrolled. PJE also provided us with final grades. 
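
As a rough illustration of how per-week LMS records of this kind can be reduced to one mean value per behaviour, the sketch below averages a student's weekly values over the weeks supplied. It assumes that each weekly record holds that week's own counts rather than a running total; the field names and the sample values are hypothetical and are not taken from the study data.

```python
from statistics import mean

# Hypothetical weekly records for one student (one dict per enrolled week).
# Assumption: each record holds that week's counts, not a running total.
weekly_records = [
    {"week": 2, "days": 4, "logins": 9, "hours": 5.5, "posts_viewed": 12, "posts_authored": 2},
    {"week": 3, "days": 3, "logins": 7, "hours": 4.0, "posts_viewed": 20, "posts_authored": 1},
    {"week": 4, "days": 5, "logins": 11, "hours": 6.5, "posts_viewed": 16, "posts_authored": 3},
]

# The five behaviours named in this section (identifiers are illustrative).
BEHAVIOURS = ("days", "logins", "hours", "posts_viewed", "posts_authored")


def mean_weekly_behaviours(records):
    """Return the per-week mean of each behaviour across the supplied weeks."""
    return {b: mean(r[b] for r in records) for b in BEHAVIOURS}


student_means = mean_weekly_behaviours(weekly_records)
```

Averaging per student first and then across students keeps every student's weight equal, whatever the number of weeks for which that student has records.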

As noted above, the LMS behaviours related to attendance were: 

• # days: The number of days a student accessed the system; 

• # logins: The number of times a student logged into the system; 

• Session duration: The total hours a student spent logged in. 1

The LMS behaviours related to interactivity were: 

• Posts viewed: The number of posts a student viewed. 

• Posts authored: The number of posts a student wrote. 

Table 3 gives the means for these behaviours for the 689 students who completed the year. When the 
weekly LMS behaviours are totalled and then averaged over the 32 weeks, we see that students logged 
in between 3 and 4 days a week and spent about half an hour per login. As would be expected, they 
viewed many more posts than they authored. However, females were more active than males for all 
behaviours. This was the first hint that, as suggested by other research, there are gender differences in 
online behaviours (Hung, Hsu, & Rice, 2012; Lowes & Lin, 2015). 

1 Session duration was the amount of time from login to logout. If the student did not deliberately log out, the system logged 
them out after 20 minutes. This figure may therefore be a slight overestimate. 


Table 3. Mean course behaviours, all weeks (2-33)

          # days   # logins   Time spent (# hours)   # posts viewed   # posts authored
All       3.6      8.0        5.2                    14.5             1.5
Female    3.8      8.7        5.4                    15.8             1.6
Male      3.4      7.2        4.9                    13.0             1.4


The differences between males and females were greatest for number of logins and number of posts 
viewed (Figure 1). 



Figure 1. Mean course behaviours by gender 


However, there was a wide range for all behaviours and for both genders (Table 4). 


Table 4. Range for course behaviours, all weeks (2-33)

          # days     # logins   Time spent (# hours)   # posts viewed   # posts authored
All       0.0-6.7    0.0-34.1   0.0-18.8               0.0-69.3         0.0-8.9
Female    0.4-6.7    0.4-34.1   0.2-18.8               0.0-68.5         0.0-8.9
Male      0.0-6.5    0.0-24.5   0.0-17.1               0.0-69.3         0.0-6.6


In addition, the means obscure a skewed distribution for all behaviours but number of days. Figure 2 
shows the raw counts for weeks 2-33 combined. This posed a problem for analysis that we dealt with by 
using a robust estimator. 2 


2 Due to violations of the normality assumption for each behaviour as well as of the multivariate normality assumption (p < .001 
for all), WLSMV was selected for the later confirmatory factor analysis because it is robust to non-normality. 

[Figure 2 panels: # logins, # hours, # posts viewed, # posts authored]

Figure 2. Frequencies for each behaviour 

Looked at over time, gender differences held for all three attendance behaviours for all but a few of the 
final weeks (Figure 3). Note that week 25 was a break week. 



[Figure 3 x-axis: Week, 2-33]

Figure 3. Mean course attendance behaviours by gender, all weeks 

The gender differences also held for both interactivity variables, with females both viewing and 
authoring more posts in almost every week (Figure 4). 

[Figure 4 x-axis: Week]

Figure 4. Mean course interactivity behaviours by gender, all weeks 

Overall, this data suggested that there were differences in the behaviour of males and females, with 
females being more active than males for both the attendance and the interactivity behaviours. 

5.3 Bivariate Correlations of Course Behaviours 

Table 5 summarizes the linear correlations between pairs of behaviours. It shows that all behaviours are 
correlated with one another (p < .001) but that the correlations are stronger among the three 
attendance behaviours, and between the two interactivity behaviours, than between any one of the 
three attendance behaviours and either of the two interactivity behaviours. The fact that this held true 
for all students and for both genders suggested that two types of behaviour are indeed present. 
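
For readers who want to reproduce this kind of pairwise comparison on their own LMS export, a minimal sketch of a Pearson correlation matrix follows. The column names and the six sample values per behaviour are invented for illustration; the sketch does not compute significance levels and is not the procedure used to produce Table 5.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Return {(name_a, name_b): r} for every ordered pair of columns."""
    names = list(columns)
    return {(a, b): pearson_r(columns[a], columns[b]) for a in names for b in names}


# Invented weekly means for six students (hypothetical, not the study data).
sample = {
    "days": [1.0, 2.5, 3.0, 4.5, 5.0, 6.0],
    "logins": [2.0, 5.0, 7.0, 9.0, 12.0, 13.0],
    "posts_viewed": [3.0, 10.0, 4.0, 20.0, 15.0, 30.0],
}
matrix = correlation_matrix(sample)
```

Comparing the within-group entries (for example days with logins) against the cross-group entries (days with posts viewed) is the same reading applied to Table 5.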


Table 5. Linear correlations between course behaviours for all students, females, and males

                      Attendance behaviour                 Interactivity behaviour
All (N = 798)         # days     # logins   # hours       # viewed   # authored
# days                -
# logins              .857***    -
# hours               .713***    .685***    -
# posts viewed        .501***    .435***    .370***       -
# posts authored      .562***    .496***    .465***       .740***    -

Females (n = 439)     # days     # logins   # hours       # viewed   # authored
# days                -
# logins              .838***    -
# hours               .707***    .700***    -
# posts viewed        .485***    .398***    .375***       -
# posts authored      .552***    .480***    .487***       .716***    -

Males (n = 359)       # days     # logins   # hours       # viewed   # authored
# days                -
# logins              .886***    -
# hours               .716***    .667***    -
# posts viewed        .509***    .481***    .355***       -
# posts authored      .570***    .520***    .433***       .765***    -

Note: ***p < .001. 


5.4 Course Behaviours and Final Grades 

When we look at course behaviours by final grade (Table 6), we see that higher grades are associated 
with higher mean course behaviours (i.e., with each behaviour averaged over the 32 weeks). We also 
see that students who did not pass their courses (grades of 1-3) were far less active than students who 
did (grades of 4-7), no matter which behaviour is considered. 


Table 6. Final grades with mean behaviours.

Final grade   # days   # logins   Time spent (# hours)   # posts viewed   # posts authored
7             4.6      11.0       6.8                    17.9             2.2
6             4.4      9.8        6.5                    18.6             2.1
5             4.0      8.6        5.5                    19.7             1.9
4             3.5      7.5        5.0                    14.0             1.5
1-3           2.8      5.9        3.7                    8.0              0.9

Table 7 shows that this was the case for both genders. 





Table 7. Final grades with mean behaviours by gender.

          Final grade   # days   # logins   Time spent (# hours)   # posts viewed   # posts authored
Female    7             4.6      11.3       6.6                    17.4             2.1
          6             4.5      10.2       6.8                    19.5             2.1
          5             4.2      9.4        5.8                    20.2             1.9
          4             3.7      8.1        5.1                    15.1             1.5
          1-3           3.0      6.5        3.8                    10.3             1.0
Male      7             4.5      10.0       7.6                    19.8             2.6
          6             4.3      9.4        6.3                    17.7             2.1
          5             3.7      7.5        5.0                    19.0             1.9
          4             3.3      7.0        4.8                    12.9             1.4
          1-3           2.6      5.3        3.6                    6.0              0.7


The difference is particularly evident when we compare the behaviours of passing and not-passing 
students (Figure 5). The largest difference is in the interactivity behaviour of posts viewed. 

[Figure 5 series: Passing, Not-passing; x-axis: # days, # logins, # hours, # posts viewed, # posts authored]

Figure 5. Mean course behaviours, passing and not-passing students 


Figure 6 shows that this difference holds true for both genders. It also shows that, while passing males 
and females viewed similar numbers of posts, females who were not passing were more active in 
viewing posts than males who were not passing. 

[Figure 6 panels: Females, Males; x-axis: # days, # logins, # hours, # posts viewed, # posts authored]

Figure 6. Mean course behaviours, passing and not-passing students, by gender 


Table 8 summarizes the linear correlations. These provide statistical evidence that all five behaviours are 
correlated with final grades (p < .001 for all), but that the interactivity behaviours are not as strongly 
correlated as the attendance behaviours. This is not surprising given that not all courses emphasize 
interaction to the same extent and that some types of interaction are not captured by post-related 
behaviours. However, the strengths of the correlations with final grades are consistently larger for males 
than females; this is particularly the case with the interactivity behaviours. 

Table 8. Linear correlations between each course behaviour and final grade

          Attendance behaviour                  Interactivity behaviour
          # days     # logins   Time spent      # posts viewed   # posts authored
All       .565***    .440***    .390***         .244***          .365***
Female    .517***    .387***    .355***         .172**           .287***
Male      .591***    .474***    .416***         .312***          .454***

Note: **p < .01, ***p < .001. 


In other words, the level of activity explains less about female performance than male performance. In 
addition, the weak correlations between both posts viewed and posts authored and final grades for 
females suggest that higher levels of activity in these two behaviours in particular do not translate into 
higher grades. 

5.5 Summary of Exploratory Analysis 

This exploratory analysis provides preliminary insights into the relationships among course behaviours 
and between course behaviours and course performance. First, for all behaviours, higher levels of 
activity are associated with higher grades. Second, the correlations among the three attendance 
behaviours and between the two interactivity behaviours suggest that two types of behaviour are 
present. However, while higher levels of activity for both types of behaviour are associated with higher 
final grades, the correlations between each interactivity behaviour and final grades are lower than the 
correlations between each attendance behaviour and final grades. Third, the existence of gender 
differences — females were more active than males for all behaviours but the strength of the 
correlations between each behaviour and final grades was higher for males than females — suggests 
that gender needs to be taken into consideration in any further exploration of the relationship between 
course behaviours and course performance. 

6 COURSE BEHAVIOURS AND COURSE PERFORMANCE: STATISTICAL 
MODELS 

We next wanted to explore whether multiple behaviours taken together can tell us more about the 
relationships between the behaviours and course performance than any single behaviour can. Since the 
five behaviour variables were measured on different scales and the sample variances exceeded 1 to 10, 
which led to convergence problems, the five behavioural variables were standardized to have a mean of 
0 and a standard deviation of 1. 
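
The standardization step can be sketched as a z-score transformation applied to each behaviour separately. The version below uses the sample standard deviation (n - 1 denominator); whether the original analysis used the sample or the population formula is not stated, so that choice is an assumption, and the input values are invented.

```python
from statistics import mean, stdev


def standardize(values):
    """Rescale values to mean 0 and (sample) standard deviation 1."""
    centre = mean(values)
    spread = stdev(values)  # assumption: sample SD; the source does not say which
    if spread == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - centre) / spread for v in values]


# Invented weekly login means for four students (hypothetical values).
logins = [2.0, 4.0, 6.0, 8.0]
logins_z = standardize(logins)
```

After this step all five behaviours share a common scale, which removes the large differences in variance between, for example, posts viewed and posts authored.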

6.1 Exploring Course Behaviours 


As the first step, we hypothesized that the five observed behaviours share a common domain that can 
be explained by an unobserved latent factor that we will call course engagement. Figure 7 shows the 
conceptual model, with E indicating allowance for measurement error: 



Figure 7. Conceptual model of course engagement 


To investigate this, we first performed confirmatory factor analysis (CFA) using Mplus 7.1 (Muthén & 
Muthén, 2013). 3 CFA Model 1 examines course engagement for all students. The model fit indices show 
that the model fits the data well (see Appendix, Table 1, for all CFA model fit indices). Figure 8 shows the 
standardized factor loadings and coefficients for this model. It provides statistical evidence that the 
latent factor, engagement, explains all five LMS behavioural variables (p < .001 for all). 4 



Figure 8. Standardized parameter estimates for CFA Model 1 


While CFA Model 1 is a good model fit, it does not allow us to examine the role of gender. To do this, we 
ran separate analyses for females (CFA Model 1a) and males (CFA Model 1b). However, although 
both models fit the data well, this introduced the chance of making a Type I error, so we then looked at 
all students with gender considered as a grouping variable (CFA Model 2). Once again, it was a good 
model fit. Figure 9 shows the standardized factor loadings and coefficients for this model. It provides 
statistical evidence that the latent factor engagement explains all five LMS behaviour variables for both 
females and males when they are included in the same model (p < .001 for all). 

3 Since the nature of LMS data is limited in the sense that each variable may not be measured independently (for example, 
number of days accessed and number of logins could be somewhat inclusive), this may lead to correlated errors (variance that 
is unexplained) among the variables. To deal with this, the model specification included three freely estimated relationships 
between # days and # logins, # logins and # hours (time spent), and # posts viewed and # posts authored (without freeing the 
relationship between # days and # hours). Note that correlated variables are linked by curved lines. 

4 For unstandardized factor loadings and coefficients for all the analyses, please contact the corresponding author. 


[Figure 9 panels: Females, Males]

Figure 9. Standardized parameter estimates for CFA Model 2 


Although both CFA Model 1 and CFA Model 2 are helpful in explaining the relationship between 
engagement and the behavioural variables, we wanted to know if the proportion of variance explained 
in the behaviours by the latent factor engagement differed for the two models. Table 9 compares the R- 
squared information. 


Table 9. R-squared information for CFA Models 1 and 2

Model         Gender           # days   # logins   # hours (time spent)   # posts viewed   # posts authored
CFA Model 1   Not considered   .871     .627       .532                   .266             .349
CFA Model 2   Females          .879     .505       .495                   .239             .314
CFA Model 2   Males            .948     .827       .493                   .261             .327


The greatest difference between males and females is in the number of logins. CFA Model 1, which does 
not consider gender, shows that course engagement explains about 63% of the variance in logins. 
However, CFA Model 2 shows that course engagement explains much more of the variance in the login 
behaviour of males (83%) than females (51%). In other words, taking gender into consideration shows 
that there are differences between males and females that we would not have seen otherwise. In 
addition, the latent variable engagement explains more of the attendance behaviours than the 
interactivity behaviours. 
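
One way to read Table 9 is through the standardized loadings. In a standardized solution where each indicator loads on a single factor, the R-squared for an indicator equals the square of its standardized loading, so the magnitude of the loading can be recovered as the square root of R-squared. The sketch below applies that identity to the CFA Model 1 row; it assumes the single-factor structure described above, and the results should match the figures only up to rounding.

```python
import math

# R-squared values for CFA Model 1, copied from Table 9.
r_squared_model_1 = {
    "# days": 0.871,
    "# logins": 0.627,
    "# hours": 0.532,
    "# posts viewed": 0.266,
    "# posts authored": 0.349,
}


def implied_loading(r_squared):
    """Magnitude of the standardized loading implied by an indicator's R-squared.

    Assumes each indicator loads on exactly one factor, so R^2 = loading^2.
    """
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("R-squared must lie between 0 and 1")
    return math.sqrt(r_squared)


implied = {name: implied_loading(value) for name, value in r_squared_model_1.items()}
```

Read this way, the attendance behaviours have implied loadings well above those of the interactivity behaviours, which is the same pattern the R-squared values show.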

6.2 Exploring the Relationship between Course Behaviours and Final Grades 

The second step was to use structural equation modelling (SEM) to extend CFA Models 1 and 2 in order 
to examine the relationship between the five behaviours and final grades. 5 Figure 10 shows the 
conceptual model. 



Figure 10. Conceptual model for relationship between engagement and final grade 


SEM Model 1 is a good model fit (see Appendix, Table 2, for model fit indices). Figure 11 shows the 
standardized factor loadings and coefficients for SEM Model 1. It provides statistical evidence that all 
five behaviours are explained by course engagement (p < .001 for all), and that higher levels of 
engagement increase the predicted probability of higher final grades (γ₁₁ = .558, p = .000, p < .001). 6 In 
other words, the model confirms a positive relationship between course behaviours and course 
performance when gender is not considered. 
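
Because the final grade is treated as an ordered categorical outcome, the coefficient acts on a probit scale. The sketch below illustrates what "increases the predicted probability of higher final grades" means in that setting. It is a simplified illustration, not the estimated model: the threshold `TAU` is a hypothetical value (the thresholds are not reported here), and a unit residual variance is assumed for simplicity.

```python
from statistics import NormalDist

GAMMA = 0.558  # standardized engagement -> final grade coefficient (SEM Model 1)
TAU = 0.5      # HYPOTHETICAL threshold on the latent grade scale (not from the paper)


def prob_above_threshold(engagement_z, gamma=GAMMA, tau=TAU):
    """Probability that the latent grade exceeds tau for a given engagement z-score.

    Simplified probit link: latent grade = gamma * engagement + e, with e ~ N(0, 1).
    The unit residual variance is an assumption made for illustration.
    """
    return NormalDist().cdf(gamma * engagement_z - tau)


for z in (-1.0, 0.0, 1.0):
    p = prob_above_threshold(z)
    print(f"engagement z = {z:+.1f}: P(grade above threshold) = {p:.3f}")
```

Whatever threshold is chosen, a positive coefficient means the probability of exceeding it rises with engagement, which is the sense in which the text speaks of a positive relationship.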



Figure 11. Standardized parameter estimates for SEM Model 1 


5 We did not consider the approach of multilevel analysis, since being in a different section (total number of sections = 39) only 
explained about 7% of the variability in the final grades (analysis performed using HLM 7). The WLSMV estimator was selected 
for estimation because the outcome variable was categorical and non-normal with a ceiling effect. 

6 Since the final grade was categorical (1-7), we interpret the coefficients as probit regression coefficients. 

We next expanded SEM Model 1 to include gender. SEM Model 2 is also a good model fit (see Appendix, 
Table 2). Figure 12 shows the standardized factor loadings and coefficients for females and males for this 
model. It provides statistical evidence that higher levels of engagement increase the predicted 
probability of higher final grades for both females (γ₁₁ = .499, p = .000, p < .001) and males (γ₁₁ = 
.599, p = .000, p < .001). In other words, the model confirms a positive relationship between course 
behaviours and course outcomes for both females and males. 7 

Figure 12. Standardized parameter estimates for SEM Model 2 (panels: Females, Males) 


Table 10 compares the proportion of variance explained by SEM Models 1 and 2. With gender not 
considered, SEM Model 1 explains about 31% of the variance in final grades (R² = .312 = .558²). However, 
when gender is considered (SEM Model 2), the model explains only 25% of the variance in the final 
grades for females (R² = .249 = .499²) but 36% for males (R² = .359 = .599²). 


Table 10. R-squared information for SEM Models 1 and 2 

| Model | Gender | R² |
|---|---|---|
| SEM 1 | Not considered | .312 |
| SEM 2 | Females | .249 |
| SEM 2 | Males | .359 |


In other words, course engagement tells us more about the course outcomes for males than for females. 
Although the proportion of variance explained is only medium in a traditional sense (Leech, Barrett, & 
Morgan, 2011), it should be remembered that these engagement behaviours are only part of the larger 
picture of what a student does while in an online course. 
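
With a single latent predictor, the R-squared for final grade equals the squared standardized coefficient, which is how the values in Table 10 relate to the coefficients reported above. The following sketch simply redoes that arithmetic; the labels are illustrative, and the coefficients are the rounded values given in the text.

```python
# Standardized engagement -> final grade coefficients as reported in the text
# (rounded to three decimals in the source).
coefficients = {
    "SEM 1 (all students)": .558,
    "SEM 2 (females)": .499,
    "SEM 2 (males)": .599,
}

# With one predictor, R-squared is the square of the standardized coefficient.
r_squared = {label: round(gamma ** 2, 3) for label, gamma in coefficients.items()}

for label, r2 in r_squared.items():
    print(f"{label:22s} R^2 = {r2:.3f}")
```

Squaring the rounded coefficient for SEM Model 1 gives .311 rather than the reported .312; the small difference is presumably due to rounding of the published coefficient.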


7 As an alternative to SEM Model 2, we made gender a fixed factor and formally tested the difference between females and 
males. This model was also a good model fit. The results confirmed that females had an increased predicted probability of 
higher engagement (γ₁₁ = .165, p = .000, p < .001) and higher final grades (γ₂₁ = .092, p = .013, p < .05). (See Appendix, Table 4, 
for model fit indices, and Appendix, Figure 1, for the conceptual model and parameter estimates.) 

6.3 Exploring the Relationship among Course Behaviours (With Attendance and 
Interactivity Behaviours Taken Separately) and Final Grades 

Section 6.2 provided statistical confirmation that there is a relationship between the five behaviours 
(represented by the latent factor engagement) and final grade. However, the exploratory analyses 
suggested that there also may be two sets of behaviours, attendance and interactivity, that may not 
have the same relationship to course performance. We therefore built an alternative two-factor model, 
with and without gender. 8 Figure 13 shows the conceptual model. 



Figure 13. Conceptual model for attendance and interactivity 

SEM Model 3 fits the data well (see Appendix, Table 3). Figure 14 shows the standardized factor loadings 
and coefficients for this model. These provide statistical evidence that number of days accessed, 
number of logins, and time spent are explained by attendance (p < .001 for all) and that number of posts 
viewed and posts authored are explained by interactivity (p < .001 for both). However, while there is 
evidence that higher levels of attendance increase the predicted probability of higher final grades (γ₁₁ = 
.538, p = .000, p < .001), there is not sufficient evidence to conclude that higher levels of interactivity do 
so (γ₁₂ = .022, p = .682, p > .05). Thus, although we had earlier found a positive relationship between the 
latent factor engagement and final grades, we now find that when the two aspects of engagement are 
considered separately, only attendance behaviours are significantly associated with final grades. 


8 As we only had two variables (indicators) for the latent "interactivity" factor, attempting to estimate the correlated errors 
between posts viewed and posts authored led to model specification issues. Therefore, the correlated errors were not estimated here. 

Figure 14. Standardized parameter estimates for SEM Model 3 


SEM Model 4 adds in gender and also fits the data well (see Appendix, Table 3). Figure 15 shows the 
standardized factor loadings and coefficients for females and males for this model. It provides statistical 
evidence that for males, both higher levels of attendance (γ₁₁ = .447, p = .000, p < .001) and higher levels 
of interactivity (γ₁₂ = .173, p = .007, p < .01) increase the predicted probability of higher course grades, 
but for females, only higher levels of attendance do so (γ₁₁ = .588, p = .000, p < .001). In fact, it suggests 
that higher levels of interactivity may actually decrease the probability of higher course grades for 
females, although there is not enough statistical evidence to conclude that this is the case (γ₁₂ = -.104, p 
= .226, p > .05). 

Figure 15. Standardized parameter estimates for SEM Model 4 


Table 11 compares the proportion of variance explained by SEM Models 3 and 4. When gender is not 
considered, the two-factor model explains about 31% of the total variance in final grades (R² = .306). 
When gender is considered, the model explains 27% of the variance in the final grades for females (R² = 
.274) and about 33% for males (R² = .327): 

Table 11. R-squared information for SEM Models 3 and 4 

| Model | Gender | R² |
|---|---|---|
| SEM Model 3 | Not considered | .306 |
| SEM Model 4 | Females | .274 |
| SEM Model 4 | Males | .327 |


If we compare SEM Model 2 and SEM Model 4 (Table 12), both of which consider gender, we see 
that, although both models fit the data very well, the R-squared information suggests that SEM Model 2, 
with course engagement as a single factor, explains males' course performance better, whereas SEM 
Model 4, with attendance and interactivity as two separate factors, explains females' course 
performance better. The difference between the two models is smaller for females than for males. 


Table 12. Comparison of SEM Models 2 and 4 

| Model | R-squared (Female) | R-squared (Male) |
|---|---|---|
| SEM Model 2 (single-factor) | .249 | .359 |
| SEM Model 4 (two-factor) | .274 | .327 |
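
The comparison of the single-factor and two-factor models can be made explicit by differencing the Table 12 values. This is a minimal sketch using the published R-squared figures; the variable names are illustrative assumptions.

```python
# R-squared values copied from Table 12.
single_factor = {"female": .249, "male": .359}  # SEM Model 2
two_factor = {"female": .274, "male": .327}     # SEM Model 4

# Change in explained variance when moving from the single-factor
# to the two-factor model, for each gender.
change = {g: round(two_factor[g] - single_factor[g], 3) for g in single_factor}

for gender, delta in change.items():
    print(f"{gender:6s} change in R^2: {delta:+.3f}")
```

The two-factor model gains about .025 for females and loses about .032 for males, so the shift is somewhat smaller in magnitude for females.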


6.4 Summary of Statistical Models 

The statistical models confirm the relationship between course behaviours and course performance for 
all students when gender is not considered: Higher levels of all five course behaviours are associated 
with higher grades. However, when attendance and interactivity behaviours are looked at separately, 
higher levels of attendance behaviours are associated with higher grades but higher levels of interactivity 
behaviours are not. 

When gender is considered, higher levels of all five behaviours are correlated with higher grades for 
both males and females. However, when attendance and interactivity behaviours are looked at 
separately, higher levels of attendance behaviours are associated with higher grades for both genders 
but higher levels of interactivity behaviours are associated with higher grades only for males. 

7 DISCUSSION 

This research explored the link between course behaviours and final grades in 12 cohort-paced 
asynchronous online courses for high school students. The LMS provided five behaviours, three of which 
were considered measures of attendance (number of days accessed, number of logins, and time spent 
logged in) and two of which were considered measures of student-student interactivity (posts viewed 
and posts authored). Treating attendance separately from interactivity was important because these 
courses were designed from a social constructivist perspective, in the belief that students will learn 
more if they engage not only with the material and with the teacher but also with each other 
(Jonassen, 1999). In other words, although the course designers believe that both types of 
engagement are necessary for success, constructivist pedagogy holds that higher levels of 
student-student interactivity are particularly important for better outcomes. 

ISSN 1929-7750 (online). The Journal of Learning Analytics works under a Creative Commons License, Attribution - NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0) 
(2015). Exploring the link between online behaviours and course performance in asynchronous online high school courses. Journal of Learning 
Analytics, 2(2), 169-194. http://dx.doi.org/10.18608/jla.2015.22.13 

The results of the exploratory analysis and the statistical models show that the five behaviours, both 
when analyzed individually and when combined into a latent factor that we called engagement, were 
positively correlated with final grades (RQ1). In addition, the bivariate correlations show that although 
higher levels of both attendance and interactivity behaviours are correlated with higher final grades, the 
correlation is stronger for the attendance behaviours than the interactivity behaviours. However, in the 
statistical models, which look at all five behaviours simultaneously, only attendance behaviours are 
correlated with final grades; the interactivity behaviours are not (RQ2). 

Adding gender (RQ3) reveals otherwise hidden differences in the relationships between the five 
behaviours and final grades. First, despite the fact that females were more active than males and had 
higher grades overall, the course behaviours (represented by the latent factor engagement) have a 
considerably stronger correlation with males' final grades than with females' final grades. Second, 
although higher levels of attendance behaviours were associated with higher final grades for both 
genders, higher levels of interactivity behaviours were associated with higher final grades for males but 
not for females. 

The results only partially confirm the constructivist belief that student-student interactivity contributes 
to learning outcomes: when we move from bivariate correlations to statistical models, the 
relationship does not seem to hold for females. This is a puzzling finding. Why are the interactivity 
behaviours less important for females than males when females are so much more active? It seems 
possible that females approach their online courses differently from males (Yukselturk & Bulut, 2009). 
For example, it may be that females in online learning situations have a more social orientation 
(Johnson, 2011; Arbaugh, 2000), or that females more than males focus on building relationships in such 
sites of interactivity as discussion forums (Rovai, 2001). In other words, females' diligence in reading and 
posting may not translate into higher grades. To find out if this is the case, we need qualitative analyses 
of both viewing and posting behaviours in order to see if males and females are indeed behaving 
differently. We also need to note that there may be variables that we did not have 
access to, such as academic history, attitudes toward school, or internal locus of control in online 
courses (Lowes & Lin, 2015), that would better explain all these behaviours. 

In terms of the approach to the analysis, the literature cited in Section 2 has in every case assumed a 
simple one-layer relationship between each researcher's chosen learning behaviours and course 
performance. Structural equation modelling (SEM) allows additional layers (Gerow et al., 2010). This 
makes SEM a useful addition to the analysis of complicated relationships (Bollen & Long, 1993; Kline, 
2011) because it makes it possible to move beyond identifying the importance of individual behaviours 
toward further exploring the multifaceted complexities of online learning. 

There are a number of limitations to our use of this LMS data set. First, although LMS behaviours can 
provide insights into what is happening inside courses, they are a digital trail that provides at best a 
partial view. In addition, LMS output can differ from LMS to LMS, depending on the access architecture 
(Campbell & Oblinger, 2007; Richards, 2011). We purposefully used LMS data easily available from most 
learning management systems and conducted analyses that are relatively easy to replicate. Other LMS 
output and other types of statistical analysis might provide additional (or different) insights. For example, 
we did not have any data on the number or timeliness of assignments submitted, which might have 
strengthened the attendance data since completing assignments strongly suggests course engagement. 
In addition, we did not have data from other types of student-student interaction, either inside the LMS 
(i.e., group work) or outside it, whether via Skype, in wikis, or offline in study groups. Second, and equally 
important, we need to recognize that each behaviour variable may not represent the same behaviour 
for everyone. Attendance behaviours such as time in the system do not necessarily equate with time on 
task while there: some students may work offline and use their online time efficiently while others may 
log in but not work steadily once there (Kovanovic et al., 2015). 

The next step in our research is to determine if these findings apply to the students in the next cohort. 
But in addition to replication, we need research on other age groups and populations, as well as 
qualitative analyses, in order to confirm (or not) the differences between attendance and interactivity 
behaviours, to determine in what circumstances gender plays a role, and in general to expand our 
understanding of how online behaviours are related to course outcomes. 

APPENDIX 

When we compare CFA Model 1 (without gender) to CFA Model 2 (gender considered), the smaller 
AIC/BIC information criteria suggest that the model that considers gender fits the data better. (For CFA 
Model 2, the Chi-square contribution was 9.610 for males and 12.306 for females.) Note that although 
CFA Model 1a (for females only) and CFA Model 1b (for males only) fit the data well, they are only for 
reference because breaking the students into two groups would inflate Type I error. 


Table 1. Model fit indices (CFA Models 1-4). 

| Model | AIC | BIC | χ² | df | Sig. of χ² | RMSEA (90% CI) | CFI | TLI | SRMR |
|---|---|---|---|---|---|---|---|---|---|
| CFA Model 1 | 7583.644 | 7665.278 | 4.006 | 2 | .135 | .038 (.000-.093) | .999 | .995 | .006 |
| CFA Model 1a | 4293.416 | 4364.528 | 2.881 | 2 | .237 | .034 (.000-.113) | .999 | .996 | .007 |
| CFA Model 1b | 3210.583 | 3277.549 | 2.130 | 2 | .345 | .015 (.000-.115) | 1.000 | .999 | .006 |
| CFA Model 2 | 7504.904 | 7631.891 | 21.196 | 12 | .039 | .049 (.011-.081) | .995 | .992 | .028 |
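
The AIC/BIC comparison between CFA Model 1 and CFA Model 2 amounts to checking which model has the smaller value on each criterion. The sketch below restates that check with the Table 1 figures; the data structure and names are illustrative assumptions.

```python
# Information criteria copied from Table 1.
fit = {
    "CFA Model 1": {"AIC": 7583.644, "BIC": 7665.278},  # gender not considered
    "CFA Model 2": {"AIC": 7504.904, "BIC": 7631.891},  # gender as grouping variable
}

for criterion in ("AIC", "BIC"):
    # Smaller values indicate the preferred model on that criterion.
    preferred = min(fit, key=lambda model: fit[model][criterion])
    reduction = fit["CFA Model 1"][criterion] - fit["CFA Model 2"][criterion]
    print(f"{criterion}: preferred = {preferred}, reduction = {reduction:.3f}")
```

Both criteria are lower for CFA Model 2, by roughly 78.7 (AIC) and 33.4 (BIC).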


When we compare SEM Model 1 to SEM Model 2, the model fit indices suggest that the model 
considering gender provides a better fit. (For SEM Model 2, the Chi-square contribution was 16.260 for 
males and 13.025 for females.) 


Table 2. Model fit indices (SEM Models 1 and 2). 

| Model | χ² | df | Sig. of χ² | RMSEA (90% CI) | CFI | TLI | SRMR |
|---|---|---|---|---|---|---|---|
| SEM Model 1 (no gender) | 13.690 | 6 | .033 | .040 (.011-.069) | .995 | .988 | .339 |
| SEM Model 2 (with gender) | 29.285 | 20 | .082 | .034 (.000-.059) | .994 | .990 | .641 |
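
As a quick consistency check on Table 2, the "Sig. of χ²" column can be recomputed from the χ² value and its degrees of freedom. The sketch below assumes that the reported significance is the upper-tail probability of a chi-square distribution with the stated df; it uses a closed-form expression that is valid only for even df, and the function name is our own.

```python
from math import exp, factorial


def chi2_upper_tail(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!,
    which holds only when df is a positive even integer.
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    return exp(-half) * sum(half ** k / factorial(k) for k in range(df // 2))


# (chi-square, df) pairs copied from Table 2.
models = {"SEM Model 1": (13.690, 6), "SEM Model 2": (29.285, 20)}

for name, (chi2, df) in models.items():
    print(f"{name}: p = {chi2_upper_tail(chi2, df):.3f}")
```

Under that assumption, the recomputed values agree with the tabled significance levels to three decimals. Test statistics that have been rescaled by a robust estimator need not reproduce in this way.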


When we compare SEM Model 3 to SEM Model 4, the model fit indices suggest that the model 
considering gender provides a better fit. (For SEM Model 4, the Chi-square contribution was 11.460 for 
males and 12.233 for females.) 


Table 3. Model fit indices (SEM Models 3 and 4). 

| Model | χ² | df | Sig. of χ² | RMSEA (90% CI) | CFI | TLI | SRMR |
|---|---|---|---|---|---|---|---|
| SEM Model 3 (no gender) | 17.845 | 5 | .003 | .057 (.030-.086) | .992 | .975 | .335 |
| SEM Model 4 (with gender) | 23.694 | 16 | .096 | .035 (.000-.062) | .995 | .990 | .492 |


Below are the model fit indices (Table 4), as well as the conceptual model and parameter estimates 
(Figure 1), for the alternative model: 


Table 4. Model fit indices (SEM alternative model). 

| Model | χ² | df | Sig. of χ² | RMSEA (90% CI) | CFI | TLI | SRMR |
|---|---|---|---|---|---|---|---|
| Gender as factor | 17.710 | 10 | .060 | .031 (.000-.054) | .995 | .990 | .414 |

Figure 1. Standardized parameter estimates for the alternative model (gender difference). 